Tags: voice AI

Google launches Gemini 3.1 Flash Live, a more human-like conversational voice model

Ars Technica
Google introduced Gemini 3.1 Flash Live, a real‑time voice model designed to sound more like a person. In Scale AI’s Audio MultiChallenge, the model scored 36.1 percent, trailing non‑conversational audio models that exceed 50 percent. The new system embeds SynthID watermarks that are imperceptible to listeners but detectable for verification. Early partners, including Home Depot and Verizon, reported positive results. Developers can access the model via AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience, and the technology appears in the Gemini Live and Search Live features.

Mistral AI Launches Open-Source Voice Model Voxtral TTS

TechCrunch
Mistral AI, a French artificial‑intelligence firm, has introduced Voxtral TTS, an open‑source text‑to‑speech model designed for real‑time performance on edge devices. The model supports nine languages, can be customized with a voice sample of less than five seconds, and delivers a time‑to‑first‑audio of 90 ms with a real‑time factor of 6×. Mistral positions the model as a low‑cost, high‑quality alternative for enterprise voice assistants, dubbing, and real‑time translation, directly competing with established players such as ElevenLabs, Deepgram, and OpenAI.