PStar Framework Reduces Hallucinations in Vision-Language Models for Robotics
Researchers propose Pseudocode-guided Structured Reasoning (PStar), a framework that adaptively selects structured pseudocode reasoning paths so that Vision-Language Models (VLMs) can reason flexibly, step by step. The approach targets VLMs' susceptibility to hallucinations, which can cause critical failures in robotic automation. PStar defines abstract reasoning functions and a structured pseudocode library that represents modular reasoning strategies, and it introduces a Difficulty Feature Vector (DFV) to guide which reasoning path is selected. The work aims to improve safety and reliability in physical deployments.
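To make the idea of difficulty-guided path selection concrete, here is a minimal Python sketch. It is not PStar's implementation: the summary does not specify the DFV's components, the reasoning functions, or the selection rule, so the feature names, weights, thresholds, step names, and the `select_path` helper below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DifficultyFeatureVector:
    """Hypothetical DFV; the real feature set is not given in the summary."""
    num_objects: int        # assumed feature: objects visible in the scene
    instruction_steps: int  # assumed feature: sub-goals implied by the command
    ambiguity: float        # assumed feature: 0.0 (clear) to 1.0 (ambiguous)

    def score(self) -> float:
        # Illustrative linear weighting, not taken from the paper.
        return 0.1 * self.num_objects + 0.2 * self.instruction_steps + self.ambiguity


# Hypothetical pseudocode library: each path is an ordered list of
# abstract reasoning function names (placeholders, not PStar's own).
PSEUDOCODE_LIBRARY: Dict[str, List[str]] = {
    "direct": ["perceive", "act"],
    "verify": ["perceive", "ground_objects", "verify", "act"],
    "decompose": ["perceive", "ground_objects", "decompose_task",
                  "plan", "verify", "act"],
}


def select_path(dfv: DifficultyFeatureVector,
                easy_threshold: float = 1.0,
                hard_threshold: float = 2.0) -> List[str]:
    """Pick a longer, more cautious reasoning path as difficulty rises."""
    s = dfv.score()
    if s < easy_threshold:
        return PSEUDOCODE_LIBRARY["direct"]
    if s < hard_threshold:
        return PSEUDOCODE_LIBRARY["verify"]
    return PSEUDOCODE_LIBRARY["decompose"]


def render_pseudocode(path: List[str]) -> str:
    """Format a path as numbered pseudocode steps for a VLM prompt."""
    return "\n".join(f"{i}. {step}()" for i, step in enumerate(path, start=1))


if __name__ == "__main__":
    task = DifficultyFeatureVector(num_objects=5, instruction_steps=3, ambiguity=0.4)
    print(render_pseudocode(select_path(task)))
```

In this sketch, a harder task simply maps to a path with more grounding and verification steps. The actual framework may select or compose paths quite differently.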
Key facts
- PStar stands for Pseudocode-guided Structured Reasoning.
- The framework is designed for Vision-Language Models (VLMs).
- VLMs are used in robotic automation for parsing commands and perceiving environments.
- Hallucinations in VLMs pose safety and reliability risks.
- PStar uses structured pseudocode reasoning paths.
- It pairs abstract reasoning functions with a structured pseudocode library of modular reasoning strategies.
- A Difficulty Feature Vector (DFV) guides adaptive path selection.
- The research is available on arXiv under ID 2605.19663.
Entities
Institutions
- arXiv