OpenAI's ChatGPT Images 2.0 model shows a breakthrough in rendering text within AI-generated images
On Tuesday, OpenAI introduced ChatGPT Images 2.0, a model that marks a substantial advance in AI-generated imagery, especially in rendering text accurately. Where DALL-E 3 produced misspellings such as "enchuita," Images 2.0 renders text well enough to produce realistic marketing content. The model is available to ChatGPT and Codex users, with premium features reserved for subscribers. It adds "thinking capabilities": it can search the web, generate multiple images from a single prompt, and verify its own output. These let it create intricate assets such as comic strips and marketing materials, although such jobs may take several minutes to complete. The model supports non-Latin scripts and renders detail at resolutions up to 2K. OpenAI has not revealed whether the model is built on a diffusion or an autoregressive architecture. An API will be launched, with pricing determined by output quality and resolution.
Key facts
- ChatGPT Images 2.0 launched on Tuesday with access for all ChatGPT and Codex users
- The model can accurately render text within images, where earlier generators such as DALL-E 3 often misspelled words
- Images 2.0 features "thinking capabilities" including web search and self-verification
- The model handles non-Latin scripts like Japanese, Korean, Hindi, and Bengali
- Complex outputs such as multi-panel comics can take several minutes to generate
- OpenAI declined to specify whether Images 2.0 uses diffusion or autoregressive models
- The model's knowledge cutoff is December 2025, which limits its accuracy on more recent events
- A gpt-image-2 API will be available with pricing based on quality and resolution
Entities
People
- Asmelash Teka Hadgu
Institutions
- OpenAI
- Lesan AI
- TechCrunch
- Microsoft