ARTFEED — Contemporary Art Intelligence

New Metric PINK Exposes Over-Correction in Handwritten Math OCR

ai-technology · 2026-04-29

A preprint posted to arXiv (2604.22774) reports that Vision-Language Models (VLMs) frequently over-correct when transcribing multi-line handwritten math: they silently fix errors in the student's work, hiding the very mistakes that educational AI is supposed to detect. The authors propose PINK (Penalized INK-based score), a semantic evaluation metric in which an LLM grades each transcription against a rubric and over-correction is penalized. The authors describe the work as the first systematic study of multi-line handwritten math OCR; it evaluates 15 state-of-the-art models.
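To make the idea of a penalized rubric score concrete, here is a minimal sketch. It is not the paper's PINK formula, which this summary does not give. The names RubricItem and penalized_score, the per-line rubric, and the penalty weight of 0.5 are all assumptions for illustration, and the LLM grader is replaced by hand-set booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RubricItem:
    """One rubric check on a transcription (hypothetical structure)."""
    description: str
    satisfied: bool               # in practice, an LLM grader's verdict
    over_corrected: bool = False  # transcription silently fixed a student error


def penalized_score(items: List[RubricItem], penalty: float = 0.5) -> float:
    """Fraction of rubric items satisfied, minus a deduction per over-correction.

    Illustrative only: the real PINK scoring rule may differ.
    """
    if not items:
        raise ValueError("rubric must contain at least one item")
    raw = sum(1.0 for item in items if item.satisfied)
    deduction = penalty * sum(1 for item in items if item.over_corrected)
    return max(0.0, (raw - deduction) / len(items))


# Three-line solution where the student made an error on line 2.
faithful = [
    RubricItem("line 1 matches the ink", True),
    RubricItem("line 2 matches the ink (error preserved)", True),
    RubricItem("line 3 matches the ink", True),
]
misread = [
    RubricItem("line 1 matches the ink", True),
    RubricItem("line 2 matches the ink", False),
    RubricItem("line 3 matches the ink", True),
]
over_corrected = [
    RubricItem("line 1 matches the ink", True),
    RubricItem("line 2 matches the ink", False, over_corrected=True),
    RubricItem("line 3 matches the ink", True),
]

scores = {
    "faithful": penalized_score(faithful),
    "misread": penalized_score(misread),
    "over_corrected": penalized_score(over_corrected),
}
```

Under these assumptions, an over-corrected line costs more than an ordinary misread, so a model that quietly repairs the student's work ranks below one that merely transcribes a symbol wrongly.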

Key facts

  • arXiv paper 2604.22774 identifies over-correction in VLMs for handwritten math OCR.
  • PINK metric uses LLM-based rubric grading to penalize over-correction.
  • First systematic study of multi-line handwritten math OCR.
  • 15 state-of-the-art models evaluated.
  • Current metrics like BLEU fail for multi-line expressions.
  • Over-correction hides student errors from educational assessment.
  • Prior studies focused on single-line expressions.
  • Study aims to improve educational AI systems.
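
A toy illustration of why surface-overlap metrics can miss over-correction: the sketch below uses a simplified unigram precision, not BLEU itself, and the three-line student solution is invented for the example. The function name unigram_overlap is likewise an assumption.

```python
from collections import Counter


def unigram_overlap(reference: str, candidate: str) -> float:
    """Share of candidate tokens that also appear in the reference (clipped counts)."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum((ref_counts & cand_counts).values())
    return matched / total


# What the student actually wrote: line 2 contains an arithmetic error (7 - 3 is not 5).
student_ink = "2x + 3 = 7\n2x = 5\nx = 2.5"

# A transcription that silently "fixes" the student's work.
over_corrected = "2x + 3 = 7\n2x = 4\nx = 2"

faithful_score = unigram_overlap(student_ink, student_ink)
over_corrected_score = unigram_overlap(student_ink, over_corrected)
```

In this example the over-corrected transcription still matches 9 of 11 tokens (about 0.82), even though it erases both places where the student went wrong. A token-level score has no way to tell that the two mismatched tokens are the ones that matter for assessment.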

Entities

Institutions

  • arXiv

Sources