Multi-Agent Pipeline for Sustainable Document Processing

other · 2026-05-20

MADP is a multi-agent framework for automating document processing in corporate settings. It combines deep learning-based classification and parsing with extraction performed by large language models (LLMs). The architecture comprises five agents (Classificator, Splitter, Parser, Extraction, and Validator), a Human-in-the-Loop (HITL) mechanism, and a novel Prompt Fine Tuning with Feedback Inheritance (PFTFI) method. An operational analysis of a real-world scenario with 100,000 invoices per year suggests that Full-Time Equivalent (FTE) requirements could fall by about 70%. In production, the system processed 955 real documents through January 2026, maintaining accuracy through selective human validation.
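The five-agent flow with selective human validation can be pictured with a minimal Python sketch. This is an illustration only, not MADP's implementation: the `Document` class, the function bodies, the `REQUIRED_FIELDS` tuple, the form-feed page separator, and the `threshold` parameter are all assumptions made for this example. Simple string rules stand in for the deep learning models and the LLM, and the PFTFI method is not modeled.

```python
from dataclasses import dataclass, field

# Hypothetical required fields for an invoice; not taken from the source.
REQUIRED_FIELDS = ("invoice number", "total")


@dataclass
class Document:
    raw: str
    doc_type: str = ""
    segments: list = field(default_factory=list)
    parsed: list = field(default_factory=list)
    fields: dict = field(default_factory=dict)
    confidence: float = 0.0
    needs_review: bool = False


def classificator(doc):
    # Stand-in for a deep learning classifier: a keyword rule.
    doc.doc_type = "invoice" if "invoice" in doc.raw.lower() else "other"
    return doc


def splitter(doc):
    # Assumption: pages are separated by form feeds.
    doc.segments = [s for s in doc.raw.split("\f") if s.strip()]
    return doc


def parser(doc):
    # Stand-in for layout parsing: strip and drop empty lines.
    doc.parsed = [
        "\n".join(line.strip() for line in seg.splitlines() if line.strip())
        for seg in doc.segments
    ]
    return doc


def extraction(doc):
    # Stand-in for LLM extraction: read "key: value" lines.
    for seg in doc.parsed:
        for line in seg.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                doc.fields[key.strip().lower()] = value.strip()
    return doc


def validator(doc, threshold):
    # Confidence here is the share of required fields found (an assumption).
    present = sum(1 for name in REQUIRED_FIELDS if doc.fields.get(name))
    doc.confidence = present / len(REQUIRED_FIELDS)
    # Selective human validation: only low-confidence documents are flagged.
    doc.needs_review = doc.confidence < threshold
    return doc


def run_pipeline(raw, threshold=1.0):
    doc = Document(raw=raw)
    for stage in (classificator, splitter, parser, extraction):
        doc = stage(doc)
    return validator(doc, threshold)
```

Calling `run_pipeline("Invoice\nTotal: 5")` yields a document flagged with `needs_review` set, because one of the two assumed required fields is missing. In this sketch, only such flagged documents would be routed to a human reviewer.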

Key facts

MADP is a multi-agent architecture for document processing automation.
It combines deep learning classification and parsing with LLM extraction.
The system includes five agents: Classificator, Splitter, Parser, Extraction, Validator.
It features a Human-in-the-Loop (HITL) mechanism.
A novel Prompt Fine Tuning with Feedback Inheritance (PFTFI) approach is used.
An operational analysis of 100,000 invoices per year suggests an FTE reduction of about 70%.
The production deployment processed 955 documents through January 2026.
The system maintains accuracy through selective human validation.

Entities

—

Sources

arXiv cs.AI — 2026-05-19