Tags: AI performance

OmniCalculator Report Finds Grok Leads in Math, While Claude Excels in Writing Quality

A new OmniCalculator benchmark shows xAI's Grok 4.2 outperforms free AI chatbots in logical and math tasks, while Anthropic's Claude 4.6 delivers the best writing consistency. Despite a surge in Claude's popularity amid concerns over ChatGPT's ties to military projects, OpenAI's ChatGPT remains the most widely used model. The study highlights distinct strengths and instability rates across the leading bots, suggesting users may need to match tools to specific tasks rather than seeking a single "smartest" AI.

Ollama Adds Support for Apple's MLX, Improving Model Performance on Macs

Ollama, a runtime for running large language models locally, announced preview support for Apple’s open‑source MLX framework and added Nvidia’s NVFP4 compression format. The update targets Apple Silicon Macs, requiring at least 32 GB of RAM, and currently supports Alibaba’s 35‑billion‑parameter Qwen 3.5 model. These changes aim to improve caching, memory efficiency, and overall speed, aligning with growing interest in running AI models on personal machines amid frustrations with cloud‑based rate limits and subscription costs.

Google's Gemini 3.1 Pro Prioritizes Deeper Reasoning Over Speed

Google’s latest Gemini model, Gemini 3.1 Pro, shifts focus from raw speed to more thoughtful problem solving. While the earlier Gemini 3 Pro delivered fast, surface‑level answers, the 3.1 update introduces a “deep think” mode that deliberately slows responses to improve logical depth and handle complex tasks such as abstract reasoning, SVG generation, and intricate logistical planning. Early testing shows the new model excelling in nuanced scenarios where multi‑layered constraints and precise code output are required, positioning it as the preferred choice for developers and power users seeking higher‑quality AI output.

AI, Data Sovereignty, and Metro-Edge Data Centers Reshape Europe's Digital Landscape

Artificial intelligence is fueling Europe’s digital ambitions, but organizations face a critical need for massive, low‑latency storage that complies with strict data‑sovereignty rules. Regulations such as the GDPR, the Data Governance Act and the AI Act push firms to keep data within specific jurisdictions, while modern AI workloads demand petabyte‑scale capacity and ultra‑fast access. To meet these twin pressures, Europe is seeing rapid growth in metro‑edge data centers (localized facilities near major population and industrial hubs) that combine high‑density storage, compliance, and proximity to compute resources. This shift toward local‑first, hybrid architectures promises to boost AI performance while satisfying regulatory requirements.

AI Coding Agents Recreate Classic Minesweeper with Mixed Results

A test of four AI coding agents tasked with rebuilding the classic Minesweeper game revealed a blend of successes and shortcomings. While the agents successfully implemented core gameplay mechanics such as chording and flagging, they varied in UI polish, sound options, and development speed. OpenAI's Codex, for instance, took roughly twice as long as Claude Code to produce a functional version. The evaluation highlights both the promise of AI-driven development and the current limits that developers must still address.

Anthropic Launches Claude Haiku 4.5, a Cost-Effective Small Model

Anthropic introduced Claude Haiku 4.5, a compact AI model designed to deliver high intelligence and speed at a fraction of the cost of its larger counterparts. Priced at $1 per million input tokens and $5 per million output tokens for API users, Haiku 4.5 undercuts Sonnet 4.5 and Opus 4.1 while matching frontier‑level performance on benchmarks such as SWE‑bench. The model targets real‑time, low‑latency tasks like chat assistants, customer service, and pair programming, and can be combined with Sonnet 4.5 in multi‑model workflows. Documentation and system cards are now available for developers.
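
To make the quoted pricing concrete, here is a minimal sketch of the cost arithmetic in Python. The two prices come from the summary above; the function name and the workload figures are hypothetical and chosen only for illustration.

```python
# Prices quoted in the summary, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 1.00
OUTPUT_PRICE_PER_MTOK = 5.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated API cost in USD for a given token count."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500k output tokens.
    print(f"${estimate_cost(2_000_000, 500_000):.2f}")  # prints $4.50
```

Because output tokens cost five times as much as input tokens at these rates, the 500k output tokens in this example account for more of the bill than the 2M input tokens.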

Next-Gen UFS 5.0 Storage Promises Faster AI Performance in Future Smartphones

Universal Flash Storage (UFS) 5.0, the upcoming successor to UFS 4.x, is set to double data transfer speeds to around 10.8 GB per second, according to the JEDEC announcement. The boost is aimed at meeting the demands of AI‑driven apps, offering quicker read/write performance while remaining power‑efficient. Though no manufacturers have yet unveiled a device with UFS 5.0, industry analysts expect Samsung to be an early adopter, potentially appearing in a future Galaxy series. Apple is unlikely to adopt the standard, while Google may incorporate it into higher‑capacity Pixel models. The technology could become a key differentiator for next‑generation smartphones.
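
As a rough illustration of what the quoted figure could mean in practice, the sketch below computes an idealized transfer time in Python. The 10.8 GB/s value is from the summary above; the 4 GB file size is a hypothetical example, and the calculation assumes the peak rate is sustained, which real devices may not achieve.

```python
# Peak transfer rate quoted in the summary, in GB per second.
UFS5_PEAK_GB_PER_S = 10.8


def transfer_time_seconds(size_gb: float, throughput_gb_per_s: float) -> float:
    """Return the idealized time to move size_gb at a constant throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


if __name__ == "__main__":
    # Hypothetical 4 GB on-device model file, read at the quoted peak rate.
    seconds = transfer_time_seconds(4.0, UFS5_PEAK_GB_PER_S)
    print(f"{seconds:.2f} s")  # prints 0.37 s
```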

OpenAI Says GPT-5 Approaches Human Performance on New GDPval Benchmark

OpenAI introduced a new benchmark called GDPval that pits its AI models against human experts across dozens of occupations. In the initial rollout, GPT-5‑high was judged better than or on par with professionals in about 40.6% of tasks, while Anthropic’s Claude Opus 4.1 achieved roughly a 49% win rate. The test covered 44 roles spanning key sectors such as healthcare, finance, and manufacturing. OpenAI says the results show AI can start offloading routine work for many jobs, though it acknowledges the current scope is limited and plans to expand the benchmark’s coverage.

Qualcomm Unveils Snapdragon X2 Elite and X2 Elite Extreme, Flagship 3nm Chips Targeting Windows PC Market

At Snapdragon Summit 2025, Qualcomm introduced the Snapdragon X2 Elite and X2 Elite Extreme, its newest 3‑nanometer system‑on‑chips designed for Windows laptops. Both chips feature the third‑generation Oryon CPU, with the Extreme model delivering an 18‑core design, up to 75% faster iso‑power performance, and a neural processing unit capable of 80 trillion operations per second. Qualcomm showcased a range of reference devices, from conventional laptops to novel form‑factors such as frisbee‑shaped and coaster‑sized PCs. The company says the first products using the new silicon will appear in the first half of 2026, positioning Qualcomm to compete directly with Intel and AMD in the high‑performance laptop segment.