Google launches Gemini Omni, a multimodal AI that creates videos from text, images and audio

At its I/O conference, Google announced Gemini Omni, a new family of multimodal models that can generate video, edit photos and create digital avatars from a mix of text, images, audio and video. The first offering, Gemini Omni Flash, produces 10-second clips and will appear in the Gemini app, YouTube Shorts and the AI Creative Studio Flow. Google says the technology reasons across inputs to deliver realistic, physics-aware content while embedding a SynthID watermark to combat deepfakes. An API for enterprise users and a higher-performance Omni Pro model are slated for later release.