GPT-Image-2 Twitter Dataset Tracks AI Imagery After OpenAI Release

ai-technology · 2026-04-30

Researchers have released the GPT-Image-2 Twitter Dataset, described as the first published collection of images generated by OpenAI's GPT-image-2 model. It is drawn from public Twitter/X posts made after the model's launch on April 21, 2026. The team collected posts through the Twitter API v2 and curated them in several stages: multilingual text heuristics (English, Japanese, and Chinese), automated browser checks for Twitter's "Made with AI" label, and matching of model-name variants. Over six days, this process yielded 10,217 confirmed GPT-image-2 images from 27,662 records. The dataset includes three analyses: CLIP-based zero-shot subject classification, OCR-based text detection (82.0% of images contain detectable text), and face detection (59.2% of images contain faces, 22,583 faces in total). The release comes as the boundary between real photographs and synthetic images becomes increasingly difficult to discern.
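The curation stages described above can be illustrated with a minimal Python sketch. This is not the researchers' code: the regular expression, the cue phrases, the `Record` fields, and the rule for combining the three signals are all assumptions made for illustration, and the "Made with AI" badge is treated as a flag supplied by a separate browser check.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern covering spelling variants such as
# "GPT-image-2", "gpt image 2", and "gptimage2".
MODEL_NAME = re.compile(r"gpt[\s\-_]?image[\s\-_]?2", re.IGNORECASE)

# Hypothetical cue phrases per language; the dataset's real heuristics
# are not given in the article.
GENERATION_CUES = {
    "en": ("generated with", "made with", "created with"),
    "ja": ("で生成", "で作成"),
    "zh": ("生成的", "生成了"),
}


@dataclass
class Record:
    text: str                 # post text
    has_image: bool           # whether the post carries an image
    made_with_ai_badge: bool  # assumed output of a separate browser check


def mentions_model(text: str) -> bool:
    """Return True if the text names the model in any matched variant."""
    return MODEL_NAME.search(text) is not None


def has_generation_cue(text: str) -> bool:
    """Return True if any language's cue phrase appears in the text."""
    lowered = text.lower()
    return any(cue in lowered
               for cues in GENERATION_CUES.values()
               for cue in cues)


def is_confirmed(record: Record) -> bool:
    """Assumed combination rule: the post must carry an image and name
    the model, plus show either the badge or a generation cue."""
    if not record.has_image or not mentions_model(record.text):
        return False
    return record.made_with_ai_badge or has_generation_cue(record.text)
```

Under this assumed rule, a post that merely mentions the model without a badge or a cue phrase is rejected, which is one plausible way a pipeline could separate discussion of the model from posts that actually share its output.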

Key facts

Dataset sourced from Twitter/X posts after GPT-image-2 release on April 21, 2026
10,217 confirmed GPT-image-2 images from 27,662 records over six days
Multi-stage curation: multilingual heuristics (English, Japanese, Chinese), badge verification, model name matching
82.0% of images contain detectable text (OCR analysis)
59.2% of images contain faces (22,583 total faces)
CLIP-based zero-shot subject taxonomy applied
First published dataset of GPT-image-2 generated images
Boundary between photographic reality and synthetic content increasingly difficult to discern
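
The headline figures above imply a few derived quantities, worked through in the short Python sketch below. The confirmation rate follows directly from the reported counts. The faces-per-image figure rests on an assumption the article does not state: that the 59.2% share is measured against the 10,217 confirmed images rather than all 27,662 records.

```python
# Figures reported in the article.
total_records = 27_662
confirmed_images = 10_217
total_faces = 22_583
face_image_share = 0.592

# Share of collected records confirmed as GPT-image-2 output.
confirmation_rate = confirmed_images / total_records

# ASSUMPTION: the 59.2% is relative to confirmed images, not all records.
images_with_faces = round(face_image_share * confirmed_images)

# Average number of faces among images that contain at least one face.
faces_per_face_image = total_faces / images_with_faces

print(f"confirmation rate: {confirmation_rate:.1%}")
print(f"images with faces (assumed denominator): {images_with_faces}")
print(f"faces per face-containing image: {faces_per_face_image:.2f}")
```

On these numbers, roughly 37% of collected records survived curation, and, if the assumed denominator holds, face-containing images average between three and four faces each.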
