Tags: Mixture of Experts

DeepSeek unveils V4 Flash and V4 Pro models, claiming the open-weight lead

Chinese AI lab DeepSeek has released preview versions of two next-generation large language models, DeepSeek V4 Flash and V4 Pro. Both use a mixture-of-experts architecture and support a 1-million-token context window, large enough to fit entire codebases or long documents into a single prompt. DeepSeek says V4 Pro, with 1.6 trillion total parameters of which 49 billion are active, is the largest open-weight model on the market, while V4 Flash is a smaller, more affordable option. The company claims the new models narrow the performance gap with leading closed-source systems and are priced well below competing frontier models.

Google launches Gemma 4 models and switches to the Apache 2.0 license

Google introduced the Gemma 4 family of open-weight AI models, with four variants optimized for local execution and mobile devices. The two larger models, 26B Mixture of Experts and 31B Dense, run unquantized on a single 80 GB Nvidia H100 GPU and can be quantized to fit consumer GPUs. The smaller Effective 2B and Effective 4B models target smartphones and edge hardware, benefiting from collaboration with Qualcomm and MediaTek. Google also replaced its custom Gemma license with the Apache 2.0 license, giving developers greater freedom to use, modify, and redistribute the models. The company claims Gemma 4 models are the most capable locally runnable AI systems, placing them near the top of open-model rankings.
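
The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes "unquantized" means 16 bits per parameter and that a typical consumer quantization uses 4 bits, neither of which the announcement states. It counts weights only, ignoring the KV cache and activations, and it counts the total parameters of the mixture-of-experts model on the assumption that all experts stay resident in memory. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


H100_MEMORY_GB = 80  # capacity quoted in the summary above

# Parameter counts come from the model names; the bit widths are assumptions.
for name, size_b in [("26B Mixture of Experts", 26), ("31B Dense", 31)]:
    for bits in (16, 4):
        gb = weight_memory_gb(size_b, bits)
        fits = "fits" if gb < H100_MEMORY_GB else "does not fit"
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights ({fits} in 80 GB)")
```

Under these assumptions the 31B model needs roughly 62 GB for weights at 16 bits, which leaves some headroom on an 80 GB card, and roughly 15.5 GB at 4 bits, which is within reach of high-end consumer GPUs.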

DeepSeek Unveils Engram to Reduce High-Bandwidth Memory Needs in Large AI Models

DeepSeek, in partnership with Peking University, unveiled Engram, a new training method that separates static memory from computation in large language models. By using hashed N-gram lookups and a context-aware gating mechanism, Engram reduces reliance on high-bandwidth memory (HBM), allowing models to operate efficiently on standard GPU memory while scaling parameter counts. Tests on a 27-billion-parameter model showed measurable gains across industry benchmarks, and the approach integrates with existing hardware solutions such as Phison’s SSD-based accelerators and emerging CXL standards. Engram could ease pressure on costly memory hardware and dampen DRAM price volatility.
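
To make "hashed N-gram lookups" and "context-aware gating" more concrete, here is a toy sketch of the general idea. It is not DeepSeek's implementation: the table size, vector width, bigram order, SHA-256 hash, and sigmoid dot-product gate are all assumptions chosen for illustration, and every name in the code is hypothetical.

```python
import hashlib
import math
import random

TABLE_SIZE = 1024  # hypothetical number of hash buckets
DIM = 8            # hypothetical vector width
NGRAM = 2          # hypothetical n-gram order (bigrams)

rng = random.Random(0)
# Static memory: a fixed table of vectors that is read by address, not computed.
table = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(TABLE_SIZE)]


def bucket(ngram):
    """Deterministically hash a tuple of token ids to a table row index."""
    digest = hashlib.sha256(",".join(map(str, ngram)).encode()).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE


def lookup(token_ids, pos):
    """Fetch the memory vector for the n-gram ending at position pos."""
    start = max(0, pos - NGRAM + 1)
    return table[bucket(tuple(token_ids[start:pos + 1]))]


def gated_add(hidden, memory):
    """Blend the retrieved vector into the hidden state through a gate in (0, 1).

    The gate depends on the current hidden state, so the same memory row can be
    weighted differently in different contexts (an assumed form of gating).
    """
    score = sum(h * m for h, m in zip(hidden, memory)) / math.sqrt(len(hidden))
    gate = 1.0 / (1.0 + math.exp(-score))
    return [h + gate * m for h, m in zip(hidden, memory)]


tokens = [5, 7, 9]
hidden_state = [0.1] * DIM
updated = gated_add(hidden_state, lookup(tokens, pos=2))
```

Because the row index depends only on the token ids, the table can in principle live in slower, cheaper memory and be fetched by address, which is the intuition behind reducing HBM pressure in the summary above.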

Mistral closes in on big-AI rivals with a new open frontier model and small models

French AI startup Mistral unveiled its Mistral 3 family, featuring a large frontier model with multimodal and multilingual capabilities and nine smaller, fully customizable models. The launch emphasizes open-weight access, allowing developers to run models on a single GPU and fine-tune them for specific enterprise tasks. Mistral positions its models as cost-effective alternatives to closed-source rivals, highlighting efficient architecture, extensive context windows, and suitability for on-premise deployment. The company also announced collaborations with partners in the robotics, cybersecurity, and automotive sectors to integrate its models into specialized applications.