OpenAI Launches GPT-Realtime-2, Translation, and Whisper Models

ai-technology · 2026-05-07

OpenAI has introduced three new audio models in its API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. GPT-Realtime-2 is the first voice model with GPT-5-class reasoning, featuring parallel tool calls, a longer 128K context window, adjustable reasoning effort, and improved recovery behavior. It scored 15.2% higher on Big Bench Audio than its predecessor, GPT-Realtime-1.5. GPT-Realtime-Translate supports live translation from 70+ input languages into 13 output languages, and Deutsche Telekom is testing it for multilingual customer support. GPT-Realtime-Whisper provides low-latency streaming speech-to-text. GPT-Realtime-2 is priced at $32 per 1M audio input tokens and $64 per 1M audio output tokens; GPT-Realtime-Translate costs $0.034 per minute and GPT-Realtime-Whisper costs $0.017 per minute. All three models are available in the Realtime API, with full EU Data Residency support and enterprise privacy commitments, and developers can test them in the Playground.

Key facts

OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 5, 2026.
GPT-Realtime-2 has GPT-5-class reasoning, 128K context, and adjustable reasoning effort.
GPT-Realtime-2 scored 15.2% higher on Big Bench Audio than GPT-Realtime-1.5.
GPT-Realtime-Translate supports 70+ input languages and 13 output languages.
GPT-Realtime-Whisper is a streaming speech-to-text model.
GPT-Realtime-2 pricing: $32/1M audio input tokens, $64/1M audio output tokens.
GPT-Realtime-Translate costs $0.034 per minute; Whisper costs $0.017 per minute.
Models support EU Data Residency and enterprise privacy commitments.
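
To make the pricing figures above concrete, here is a minimal cost-estimation sketch in Python. The rates are the ones quoted in this article; the function names and the usage volumes in the example are hypothetical and chosen only for illustration. The sketch covers audio tokens and per-minute billing only, since those are the only prices reported here, so a real bill may include other items.

```python
from decimal import Decimal

# Rates as quoted in the article (USD).
REALTIME2_INPUT_PER_1M = Decimal("32")    # per 1M audio input tokens
REALTIME2_OUTPUT_PER_1M = Decimal("64")   # per 1M audio output tokens
TRANSLATE_PER_MINUTE = Decimal("0.034")
WHISPER_PER_MINUTE = Decimal("0.017")

ONE_MILLION = Decimal(1_000_000)


def realtime2_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate GPT-Realtime-2 audio cost from token counts (hypothetical helper)."""
    total = (input_tokens * REALTIME2_INPUT_PER_1M
             + output_tokens * REALTIME2_OUTPUT_PER_1M)
    return total / ONE_MILLION


def per_minute_cost(minutes, rate_per_minute: Decimal) -> Decimal:
    """Estimate cost for a per-minute billed model (hypothetical helper)."""
    # Going through str() avoids binary floating-point noise for inputs like 1.5.
    return Decimal(str(minutes)) * rate_per_minute


if __name__ == "__main__":
    # Assumed example volumes, not figures from the article.
    print(f"Realtime-2: ${realtime2_cost(1_000_000, 500_000):.2f}")
    print(f"Translate, 60 min: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 min: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

Decimal is used instead of float so that small per-minute rates add up exactly. At the quoted rates, an hour of translation costs twice as much as an hour of transcription.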

Entities

Institutions

OpenAI
Zillow
Deutsche Telekom
Priceline
Vimeo

Locations

India
Japan
EU

Sources

OpenAI Blog — 2026-05-07