AI Model InVitroVision Describes Embryo Development from Limited Data

ai-technology · 2026-04-25

Researchers have developed InVitroVision, a fine-tuned version of the PaliGemma-2 vision-language model that automatically generates natural-language descriptions of embryo morphology and development from time-lapse images. Trained on only 1,000 publicly available embryo images with corresponding captions, the model outperformed ChatGPT 5.2 and the base models on overall metrics, and its performance improved with larger training datasets. The study, published on arXiv (2604.21061), demonstrates that foundational vision-language models can generalize to IVF tasks with limited annotated data, which could improve consistency and standardization in IVF decision-making.

Key facts

InVitroVision is a fine-tuned version of PaliGemma-2
Trained on 1,000 publicly available embryo time-lapse images and captions
Describes embryo morphology, cell cycle, and developmental stage
Outperformed ChatGPT 5.2 and base models
Performance improves with larger training datasets
Study published on arXiv (2604.21061)
Demonstrates potential of vision-language models for IVF tasks with limited data
Aims to improve consistency and standardization in IVF decisions
