ARTFEED — Contemporary Art Intelligence

OpenAI Launches GPT-Realtime-2, Translation, and Whisper Models

ai-technology · 2026-05-07

OpenAI has introduced three new audio models in its API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. GPT-Realtime-2 is the first voice model with GPT-5-class reasoning; it adds parallel tool calls, a longer 128K context, adjustable reasoning effort, and improved recovery behavior, and it scored 15.2% higher on Big Bench Audio than its predecessor, GPT-Realtime-1.5. GPT-Realtime-Translate supports live translation from 70+ input languages into 13 output languages, and Deutsche Telekom is testing it for multilingual customer support. GPT-Realtime-Whisper provides low-latency streaming speech-to-text. GPT-Realtime-2 is priced at $32 per 1M audio input tokens and $64 per 1M audio output tokens; GPT-Realtime-Translate costs $0.034 per minute and GPT-Realtime-Whisper costs $0.017 per minute. The models are available in the Realtime API with full EU Data Residency support and enterprise privacy commitments, and developers can test them in the Playground.

Key facts

  • OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 5, 2026.
  • GPT-Realtime-2 has GPT-5-class reasoning, 128K context, and adjustable reasoning effort.
  • GPT-Realtime-2 scored 15.2% higher on Big Bench Audio than GPT-Realtime-1.5.
  • GPT-Realtime-Translate supports 70+ input languages and 13 output languages.
  • GPT-Realtime-Whisper is a streaming speech-to-text model.
  • GPT-Realtime-2 pricing: $32/1M audio input tokens, $64/1M audio output tokens.
  • GPT-Realtime-Translate costs $0.034 per minute; Whisper costs $0.017 per minute.
  • Models support EU Data Residency and enterprise privacy commitments.
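
The pricing figures above can be turned into a rough cost estimate. The sketch below is illustrative only: the function names are hypothetical, the token counts and minute totals are made-up inputs, and the article does not say how many tokens a minute of audio consumes, so no such conversion is attempted.

```python
# Prices as quoted in the article (USD). Everything else here is an
# illustrative assumption, not part of any OpenAI SDK.
REALTIME2_INPUT_PER_MTOK = 32.0    # per 1M audio input tokens
REALTIME2_OUTPUT_PER_MTOK = 64.0   # per 1M audio output tokens
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def realtime2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 audio cost from token counts."""
    return (
        input_tokens / 1_000_000 * REALTIME2_INPUT_PER_MTOK
        + output_tokens / 1_000_000 * REALTIME2_OUTPUT_PER_MTOK
    )


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for the per-minute models (Translate, Whisper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Hypothetical workload: 500K input tokens, 250K output tokens.
    print(f"GPT-Realtime-2: ${realtime2_cost(500_000, 250_000):.2f}")
    # Hypothetical workload: one hour of audio on each per-minute model.
    print(f"Translate, 60 min: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 min: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

At the quoted rates, one hour of translation costs exactly twice one hour of transcription, since $0.034 is double $0.017 per minute.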

Entities

Institutions

  • OpenAI
  • Zillow
  • Deutsche Telekom
  • Priceline
  • Vimeo

Locations

  • India
  • Japan
  • EU

Sources