
Tags: Multimodal

Meta launches Muse Spark, a multimodal chatbot built into its apps

Meta unveiled Muse Spark, the first AI model from its Superintelligence Labs, as the new default experience across the Meta AI app and website in the United States. The multimodal chatbot can generate text, images, video and audio, handling complex constraints while staying socially fluent. Meta says the model prioritizes people and will soon roll out to Instagram, Facebook, Messenger, WhatsApp and upcoming AI glasses. That positions the service as a direct competitor to ChatGPT and Google Gemini inside the platforms billions of people already use.

Google Labs Unveils Stitch, an AI‑Powered Design Platform for Websites and Apps

Google Labs introduced Stitch, an AI‑driven design tool that lets users create website and app interfaces through natural‑language prompts. Built on Gemini models, Stitch supports text and voice input, offers real‑time design critiques, and allows on‑the‑fly edits such as moving buttons or changing colors. Currently in beta, the platform can generate multi‑page layouts, extract design systems from URLs, and export design rules as a Markdown file. Early testers reported impressive visual results but noted occasional glitches typical of a beta product.

Rumors Hint at ChatGPT 5.1 Launch with Stronger Memory and Multimodal Features

Code references labeled "gpt-5-1-thinking" have sparked speculation that OpenAI may soon roll out a new version of its flagship chatbot, dubbed ChatGPT 5.1. The rumored upgrade is expected to expand the model's context window, improve long‑term memory, tighten factual grounding, and offer smoother multimodal interactions that blend text, images, and audio. Users also hope for more reliable automation tools and a steadier conversational style that remembers personal preferences without repeated prompts. While details remain limited, the buzz reflects anticipation that the next iteration will address several lingering shortcomings of GPT‑5.

Google Gemini Expands as Multimodal AI Rival to ChatGPT

Google Gemini is positioning itself as a direct competitor to OpenAI's ChatGPT by offering deep integration across Google services, real‑time information access, and native support for files, images, and audio. The assistant operates within Chrome, Android, Workspace, and other Google products, allowing users to ask questions about PDFs, Docs, images, and short audio clips without extra steps. Gemini also taps into Google’s vast data ecosystem, offers image generation through the Nano Banana model, and can orchestrate tasks across Maps, messaging, and calendars. Users praise its polished design and multimodal capabilities, though some voice concerns about data privacy. Visits to Gemini have risen sharply, indicating growing interest.

RabbitOS 2 Transforms Rabbit R1 with New Look and AI Features

Rabbit has launched RabbitOS 2, a major software overhaul for the Rabbit R1 handheld AI device. The update adds a colorful card‑based interface, multimodal interactions across voice, text and images, and a new "Creations" platform that lets users build tools, games and experiences through conversational "vibe coding". It aims to close the gap between early expectations and the original experience, offering a more visual and intuitive way to access the R1’s capabilities while keeping the device’s low price and niche appeal.

NotebookLM Expands Audio Overviews with New Formats

Google's Gemini-powered AI research assistant NotebookLM is adding three new Audio Overview formats (Brief, Critique, and Debate) to its existing Deep Dive option. The new formats let users receive bite‑size summaries, expert feedback, or a simulated debate between two hosts, all generated from the user’s own notes and sources. This expansion aims to make the tool more flexible for learning and research, building on the popularity of its multimodal capabilities.