ARTFEED — Contemporary Art Intelligence

AI Model InVitroVision Describes Embryo Development from Limited Data

ai-technology · 2026-04-25

Researchers have developed InVitroVision, a fine-tuned version of the PaliGemma-2 vision-language model that automatically generates natural-language descriptions of embryo morphology and development from time-lapse images. Fine-tuned on only 1,000 publicly available embryo images and their corresponding captions, the model outperformed ChatGPT 5.2 and base models on overall metrics. The study, published on arXiv (2604.21061), demonstrates that foundation vision-language models can generalize to IVF tasks with limited annotated data, which could improve consistency and standardization in IVF decision-making. Performance improved with larger training datasets.

Key facts

  • InVitroVision is a fine-tuned version of PaliGemma-2
  • Trained on 1,000 publicly available embryo time-lapse images and captions
  • Describes embryo morphology, cell cycle, and developmental stage
  • Outperformed ChatGPT 5.2 and base models
  • Performance improves with larger training datasets
  • Study published on arXiv (2604.21061)
  • Demonstrates potential of vision-language models for IVF tasks with limited data
  • Aims to improve consistency and standardization in IVF decisions
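
The summary does not say how training-set size was varied. As a rough illustration only, one common way to study the effect of dataset size is to fine-tune on nested subsets of the same shuffled image-caption pairs, so that each larger subset contains the smaller ones. The sketch below uses placeholder file names and captions; the function name `nested_subsets`, the subset sizes, and the seed are all assumptions for illustration, not details from the study.

```python
import random


def nested_subsets(pairs, sizes, seed=0):
    """Return {size: subset}; every smaller subset is a prefix of the larger ones.

    `pairs` is a list of (image_path, caption) tuples. Sizes and seed are
    illustrative assumptions, not values taken from the paper.
    """
    if max(sizes) > len(pairs):
        raise ValueError("requested subset is larger than the dataset")
    shuffled = list(pairs)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    return {n: shuffled[:n] for n in sorted(sizes)}


# Placeholder data standing in for 1,000 embryo images and their captions.
pairs = [(f"embryo_{i:04d}.png", f"caption {i}") for i in range(1000)]
subsets = nested_subsets(pairs, sizes=[100, 250, 500, 1000])

for size, subset in subsets.items():
    print(size, len(subset))
```

Each subset would then be used for a separate fine-tuning run and scored on the same held-out images, so that any difference in the metrics can be attributed to the amount of training data.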

Entities

Institutions

  • arXiv
