Tags: voice AI

May 10, 2026

OpenAI unveils GPT‑Realtime‑2 and new voice API models, slashing costs for real‑time translation and transcription

The Next Web

OpenAI announced three new voice‑AI models (GPT‑Realtime‑2, GPT‑Realtime‑Translate, and GPT‑Realtime‑Whisper), bringing GPT‑5‑class reasoning to live audio, real‑time translation in over 70 languages, and low‑latency streaming transcription. The rollout promises faster turn‑taking, parallel tool calls, tone control and a 128K context window, while pricing undercuts most enterprise solutions. Early adopters such as Zillow and BolnaAI report significant gains in call success and word‑error rates, signaling a shift toward integrated, end‑to‑end voice agents.

May 10, 2026

Wispr Flow Bets Big on Voice AI in India Despite Challenges

TechCrunch

Wispr Flow, a Bay Area-headquartered startup, is expanding its voice AI software in India, despite the country's linguistic complexity and uneven monetization patterns. The company sees India as its fastest-growing market and is investing in multilingual voice support and local hiring.

May 8, 2026

OpenAI Unveils Three Real‑Time Voice Models, Expanding AI to Live Conversation, Translation and Streaming Transcription

Digital Trends

OpenAI announced three new audio models for its Realtime API: GPT‑Realtime‑2, GPT‑Realtime‑Translate and GPT‑Realtime‑Whisper. The suite pushes voice AI beyond simple back‑and‑forth exchanges, offering live reasoning, on‑the‑fly translation across 70+ languages and streaming transcription. Developers can now build assistants that schedule home tours, manage travel bookings or provide real‑time captions, while pricing starts at $0.017 per minute for Whisper and $0.034 per minute for Translate, with GPT‑Realtime‑2 billed at $32 per million audio tokens.

May 8, 2026

OpenAI adds real‑time voice, translation and transcription to its API

TechCrunch

OpenAI announced Thursday that its API now supports three new voice‑focused models: GPT‑Realtime‑2, GPT‑Realtime‑Translate and GPT‑Realtime‑Whisper. The suite lets developers build applications that can converse, translate and transcribe speech on the fly, with support for more than 70 input languages and 13 output languages. Billing is split between per‑minute rates for translation and transcription and token‑based pricing for the conversational model. OpenAI says the tools target customer‑service, education, media and creator platforms, and include guardrails to curb misuse.