OR-VSKC: Synthetic Data Benchmark for Surgical Safety Risks

other · 2026-05-01

Researchers have introduced OR-VSKC, a benchmark for studying Visual-Semantic Knowledge Conflicts (VS-KC) in Multimodal Large Language Models (MLLMs) applied to operating rooms (ORs). Because real-world OR data is scarce and restricted by privacy constraints, the benchmark uses a Protocol-to-Pixel Generative Framework to produce 28,190 high-fidelity synthetic images grounded in authoritative safety standards. It also includes a challenge subset of 713 expert-authored images validated by multiple experts. The aim is to improve automated identification of surgical safety risks by targeting cases where a model holds the relevant safety knowledge but fails to apply it when assessing an image.
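
To make the idea of a knowledge conflict concrete, here is a minimal sketch of one way such a gap could be quantified. It is not the benchmark's own metric or data schema: the record fields, the function name `conflict_rate`, and the sample values are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One hypothetical evaluation record (not the benchmark's actual schema)."""
    image_id: str
    knows_rule: bool       # model stated the safety rule correctly in a text-only query
    flags_violation: bool  # model flagged the violation when shown the image


def conflict_rate(results):
    """Share of rule-aware cases where the model still missed the visual violation."""
    aware = [r for r in results if r.knows_rule]
    if not aware:
        return 0.0
    missed = sum(1 for r in aware if not r.flags_violation)
    return missed / len(aware)


# Made-up sample records, for illustration only.
results = [
    PairedResult("img_001", True, True),
    PairedResult("img_002", True, False),
    PairedResult("img_003", False, False),
    PairedResult("img_004", True, False),
]
print(conflict_rate(results))
```

Restricting the denominator to cases where the model already knows the rule separates a failure to apply knowledge from a simple lack of knowledge, which is the distinction the summary draws.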

Key facts

OR-VSKC is a benchmark for studying Visual-Semantic Knowledge Conflicts in operating rooms.
The benchmark comprises 28,190 high-fidelity synthetic images.
Images are generated via a Protocol-to-Pixel Generative Framework.
The framework is grounded in authoritative safety standards.
A 713-image expert-authored challenge subset is included.
The challenge subset is validated by multiple experts.
The research addresses scarcity and privacy constraints of real-world OR data.
The goal is to improve automated identification of surgical safety risks.

Sources

arXiv cs.AI — 2026-05-01