FaithLens AI Model Detects Hallucinations in LLM Outputs with Explanations
FaithLens, a new 8B-parameter AI model, has been developed to detect faithfulness hallucinations, that is, claims in a large language model's output that are not supported by the given source context. For each input, the system produces both a binary prediction and a corresponding explanation, which makes its judgments easier to trust and audit. Training data was synthesized using advanced LLMs and then filtered for label accuracy, explanation quality, and diversity. After fine-tuning on this curated data, the model was further optimized with rule-based reinforcement learning that rewards both prediction correctness and explanation quality. Across 12 diverse tasks, FaithLens outperforms advanced models like GPT. The research paper was published on arXiv under identifier arXiv:2512.20182v4. The work targets real-world applications such as retrieval-augmented generation and summarization, and the model's cost-efficient 8B-parameter design makes it practical to deploy.
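As a rough illustration of this interface, the sketch below queries a FaithLens-style checker for a label and explanation via the Hugging Face transformers library. The model identifier, prompt wording, and output format are assumptions for illustration; the summary does not specify the model's released interface.

```python
# Minimal sketch of querying a FaithLens-style hallucination checker.
# "org/FaithLens-8B", the prompt wording, and the answer format are
# hypothetical; they are not taken from the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "org/FaithLens-8B"  # hypothetical model identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def check_faithfulness(context: str, response: str) -> str:
    """Ask the model for a binary label plus a short explanation."""
    messages = [{
        "role": "user",
        "content": (
            "Context:\n" + context + "\n\n"
            "Response:\n" + response + "\n\n"
            "Is the response faithful to the context? Answer "
            "'faithful' or 'hallucinated', then explain briefly."
        ),
    }]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens (label + explanation).
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```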
Key facts
- FaithLens detects faithfulness hallucinations in LLM outputs
- Provides binary predictions and explanations simultaneously
- Uses 8B parameters
- Outperforms advanced models on 12 diverse tasks
- Training data synthesized via advanced LLMs
- Optimized with rule-based reinforcement learning (see the reward sketch after this list)
- Published on arXiv as arXiv:2512.20182v4
- Addresses needs in retrieval-augmented generation and summarization
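To make the rule-based reinforcement learning concrete, here is a minimal sketch of a reward function of the kind described: a correctness term plus heuristic bonuses for explanation quality, awarded only when the predicted label is right. The specific rules and weights are assumptions; the paper's actual reward design is not given in this summary.

```python
# Illustrative rule-based reward combining prediction correctness with
# simple explanation-quality heuristics. All rules and weights here are
# assumptions, not the paper's reward specification.
def rule_based_reward(pred_label: str, gold_label: str, explanation: str) -> float:
    reward = 0.0
    if pred_label == gold_label:
        reward += 1.0  # correctness term dominates
        words = explanation.split()
        if 10 <= len(words) <= 120:
            reward += 0.25  # neither empty nor rambling
        if explanation.strip():
            reward += 0.25  # an explanation was actually produced
    return reward
```

Gating the explanation bonuses on label correctness discourages the policy from producing fluent justifications for wrong answers, which is one plausible reason to reward both signals jointly rather than separately.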