AgenticEval: Self-Evolving Safety Evaluation for LLMs

ai-technology · 2026-05-16

A new multi-agent framework called AgenticEval proposes a paradigm shift in safety evaluation for Large Language Models (LLMs), moving from static benchmarks to a continuous, self-evolving process. The system autonomously ingests unstructured policy documents to generate and perpetually update safety benchmarks. It uses a pipeline of specialized agents and a Self-evolving Evaluation loop that learns from results to create more sophisticated test cases. Experiments show consistent effectiveness in addressing dynamic AI risks and evolving regulations.

Key facts

AgenticEval is a multi-agent framework for LLM safety evaluation.
It reframes evaluation as a continuous and self-evolving process.
The system autonomously ingests unstructured policy documents.
It generates and perpetually evolves a comprehensive safety benchmark.
It uses a synergistic pipeline of specialized agents.
It incorporates a Self-evolving Evaluation loop.
The loop learns from evaluation results to craft more sophisticated test cases.
Experiments demonstrate consistent effectiveness.

Entities

—

Sources

arXiv cs.AI — 2026-05-16