Tag: speech synthesis

Mistral AI Launches Open-Source Voice Model Voxtral TTS

Mistral AI, a French artificial‑intelligence firm, has introduced Voxtral TTS, an open‑source text‑to‑speech model designed for real‑time performance on edge devices. The model supports nine languages, can be customized with a voice sample of less than five seconds, and delivers a time‑to‑first‑audio of 90 ms with a real‑time factor of 6×. Mistral positions the model as a low‑cost, high‑quality alternative for enterprise voice assistants, dubbing, and real‑time translation, directly competing with established players such as ElevenLabs, Deepgram, and OpenAI.

Bengaluru Startup Sarvam AI Claims Its Vision Model Beats Gemini and ChatGPT on Indian Language OCR

Sarvam AI, a Bengaluru‑based startup, says its Sarvam Vision model outperforms global rivals Gemini and ChatGPT on key optical character recognition (OCR) benchmarks for Indian languages. The model supports all 22 scheduled Indian languages and can handle complex tables, charts, and real‑world scene text. Paired with the Bulbul V3 text‑to‑speech system, which offers 35 local‑accented voices, the company positions itself as a builder of "sovereign AI" tailored to India’s linguistic diversity. Sarvam hopes its technology will help small businesses and government agencies digitize records more accurately and spur broader AI innovation focused on regional needs.

ElevenLabs CEO Declares Voice the Next Major AI Interface

ElevenLabs co‑founder and CEO Mati Staniszewski told attendees at the Web Summit that voice is poised to become the primary way people interact with artificial‑intelligence systems. He highlighted recent advances that let voice models convey emotion and work alongside large language models, and outlined the company’s push toward hybrid cloud‑and‑device processing for wearables and other hardware. Staniszewski also noted partnerships with Meta and warned that deeper voice integration raises privacy and surveillance concerns.