SKG-Eval: Stateful Evaluation of Multi-Turn Dialogue via Incremental Semantic Knowledge Graphs
SKG-Eval (arXiv:2605.16650) is a quasi-deterministic, interpretable framework for evaluating multi-turn dialogue systems. Existing automatic evaluators, such as LLM-as-a-judge and embedding-based metrics, rely on flat or isolated turn representations. SKG-Eval instead models a dialogue as an evolving Semantic Knowledge Graph (SKG) of entities, relations, and commitments that spans multiple turns. It updates the graph incrementally through structured triple extraction and computes three complementary signals: local relevance, historical consistency, and entity coherence. The approach targets long-range failures such as contradiction, topic drift, and entity inconsistency, which existing approaches often miss. The paper is listed on arXiv as 2605.16650v1 with announcement type cross.
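To make the idea of incremental, graph-based scoring concrete, here is a minimal sketch in Python. It is not the paper's implementation: the summary does not give the scoring formulas, so the class `DialogueGraph`, the method `score_and_update`, the helper `_overlap`, and the three simple ratio-based scores are all illustrative assumptions. Triples are supplied by hand, where a real system would extract them from each turn.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, relation, object)


def _overlap(current: set[str], reference: set[str]) -> float:
    """Share of `current` found in `reference`; 1.0 when there is nothing to compare."""
    if not current or not reference:
        return 1.0
    return len(current & reference) / len(current)


@dataclass
class DialogueGraph:
    """Hypothetical stand-in for an incrementally updated dialogue graph."""
    # Assumption: every (subject, relation) pair holds a single object.
    facts: dict[tuple[str, str], str] = field(default_factory=dict)
    entities: set[str] = field(default_factory=set)
    last_turn_entities: set[str] = field(default_factory=set)

    def score_and_update(self, triples: list[Triple]) -> dict[str, float]:
        """Score one turn's triples against the graph, then merge them in."""
        turn_entities = {e for s, _, o in triples for e in (s, o)}

        # Historical consistency: share of triples that do not contradict a stored fact.
        clashes = sum(
            1 for s, r, o in triples
            if (s, r) in self.facts and self.facts[(s, r)] != o
        )
        consistency = 1.0 - clashes / len(triples) if triples else 1.0

        # Local relevance: entity overlap with the immediately preceding turn.
        relevance = _overlap(turn_entities, self.last_turn_entities)
        # Entity coherence: entity overlap with everything seen so far.
        coherence = _overlap(turn_entities, self.entities)

        # Incremental update: keep the earliest statement of each fact.
        for s, r, o in triples:
            self.facts.setdefault((s, r), o)
        self.entities |= turn_entities
        self.last_turn_entities = turn_entities

        return {
            "local_relevance": relevance,
            "historical_consistency": consistency,
            "entity_coherence": coherence,
        }


graph = DialogueGraph()
t1 = graph.score_and_update([("user", "lives_in", "Paris"), ("user", "likes", "jazz")])
t2 = graph.score_and_update([("user", "lives_in", "Berlin")])  # contradicts turn 1
```

In this toy run the second turn contradicts the stored `lives_in` fact, so its historical consistency drops to 0.0 while half of its entities are still shared with earlier turns. Because the scores are computed from explicit graph state rather than sampled model judgments, the same triples always yield the same scores, and each low score can be traced to the specific clashing fact. That is one plausible reading of the "quasi-deterministic and interpretable" claim, not a confirmed description of the paper's method.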
Key facts
- SKG-Eval is a framework for evaluating multi-turn dialogue systems.
- It uses an evolving Semantic Knowledge Graph (SKG) of entities, relations, and commitments.
- The framework computes three signals: local relevance, historical consistency, and entity coherence.
- Existing evaluators such as LLM-as-a-judge and embedding-based metrics rely on flat or isolated turn representations and often miss long-range issues.
- SKG-Eval is quasi-deterministic and interpretable.
- The paper is published on arXiv with identifier 2605.16650v1.
- The arXiv announcement type is cross (a cross-listing).
- The framework addresses contradiction, topic drift, and entity inconsistency.
Entities
Institutions
- arXiv