DeepSeek Releases Open-Source AI Models That Rival Leading U.S. Systems

Chinese startup DeepSeek has released two new AI models, DeepSeek‑V3.2 and DeepSeek‑V3.2‑Speciale, under an open-source license. The models are claimed to perform on par with GPT‑5 and Gemini 3 Pro on long‑form reasoning, tool use, and dense problem solving, while offering a 128,000‑token context window and reduced computational cost through Sparse Attention. Their launch challenges the dominance of U.S. AI firms, sparks regulatory scrutiny in Europe, and raises questions about the future of AI accessibility and geopolitics.

Oct 1, 2025

DeepSeek Unveils Sparse Attention Model to Halve API Inference Costs

DeepSeek announced a new experimental AI model featuring Sparse Attention technology that dramatically lowers inference costs for long‑context tasks. The model, released on Hugging Face and accompanied by a research paper on GitHub, uses a lightning indexer and fine‑grained token selection to focus computational resources on the most relevant excerpts. Preliminary tests suggest API call prices can be cut by as much as 50 percent in long‑context scenarios. The open‑weight release invites third‑party validation and positions DeepSeek as a notable player in the ongoing effort to make transformer‑based AI more cost‑effective.

Oct 1, 2025

DeepSeek Explores Sparse Attention to Reduce AI Compute Costs

DeepSeek is testing a sparse attention technique aimed at cutting the processing costs of large AI language models. By limiting the number of token‑to‑token comparisons, the approach seeks to mitigate the quadratic scaling inherent in traditional transformer architectures, where attention cost grows with the square of the input length. The effort could make long‑form interactions more affordable while maintaining the model’s ability to understand context.
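
As a rough illustration of the scaling argument above, the following sketch counts comparisons for full attention versus a scheme in which each token attends to at most `k` others. The function names and the budget `k = 2048` are hypothetical and chosen only for illustration; they are not DeepSeek figures, and the count ignores whatever it costs to choose which tokens to keep.

```python
def full_attention_pairs(n_tokens: int) -> int:
    """Comparisons when every token attends to every token (quadratic)."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, k: int) -> int:
    """Comparisons when each token attends to at most k selected tokens."""
    return n_tokens * min(k, n_tokens)


if __name__ == "__main__":
    k = 2048  # hypothetical per-token budget, NOT a published DeepSeek value
    for n in (1_000, 10_000, 100_000):
        full = full_attention_pairs(n)
        sparse = sparse_attention_pairs(n, k)
        print(f"{n:>7} tokens: full={full:,} sparse={sparse:,} "
              f"ratio={full / sparse:.1f}x")
```

Under these assumptions the gap only opens up once the input is longer than `k`, which is consistent with the savings being described for long‑context use.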

Sep 30, 2025

DeepSeek Unveils Sparse‑Attention Model V3.2‑exp to Halve Inference Costs

DeepSeek announced its experimental model V3.2‑exp, featuring a new Sparse Attention mechanism that dramatically lowers inference expenses for long‑context tasks. The architecture employs a lightning indexer to prioritize excerpts and a fine‑grained token selector to feed a limited attention window, allowing the model to process extensive context with reduced server load. Preliminary tests suggest API calls in long‑context scenarios could cost as little as half as much as before. The model is open‑weight and freely available on Hugging Face, inviting independent verification and broader adoption.
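
The two‑stage idea described above (score positions cheaply, then attend only to the selected ones) can be sketched generically as top‑k sparse attention. This is a minimal illustration, not DeepSeek's implementation: the summary does not say how the indexer computes its scores, so `index_scores` is simply passed in as made‑up numbers, and all function names here are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_top_k(index_scores, k):
    """Return the positions of the k highest scores, in original order."""
    ranked = sorted(range(len(index_scores)),
                    key=lambda i: index_scores[i], reverse=True)
    return sorted(ranked[:k])


def sparse_attention(query, keys, values, index_scores, k):
    """Softmax attention restricted to the k positions picked by index_scores."""
    chosen = select_top_k(index_scores, k)
    scale = math.sqrt(len(query))
    logits = [dot(query, keys[i]) / scale for i in chosen]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    dim = len(values[0])
    output = [
        sum(w / total * values[i][d] for w, i in zip(weights, chosen))
        for d in range(dim)
    ]
    return chosen, output


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    query = [1.0, 0.0]
    # Made-up scores standing in for a cheap indexer (an assumption).
    index_scores = [0.9, 0.1, 0.8, 0.2]
    chosen, output = sparse_attention(query, keys, values, index_scores, k=2)
    print("attended positions:", chosen)
    print("output:", output)
```

Only the `k` selected positions enter the softmax, so the expensive attention step scales with `k` rather than with the full context length; the scoring pass still touches every position and must be cheap for the scheme to pay off.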