Apr 1, 2026

Ollama Adds Apple MLX Support, Boosts Mac Model Performance

Ollama, a runtime for running large language models locally, has announced preview support for Apple’s open-source MLX framework and added support for Nvidia’s NVFP4 compression format. The preview targets Apple Silicon Macs with at least 32 GB of RAM and currently supports only Alibaba’s 35-billion-parameter Qwen 3.5 model. The changes aim to improve caching, memory efficiency, and overall speed. They arrive amid growing interest in running AI models on personal machines, driven in part by frustration with cloud rate limits and subscription costs.

Nov 19, 2025

Apple Enables Mac Clusters for AI Workloads with macOS Tahoe 26.2

Apple’s upcoming macOS Tahoe 26.2 adds a low-latency Thunderbolt 5 feature that lets multiple Macs operate as a single AI supercomputer. By linking four Mac Studios, each configurable with up to 512 GB of unified memory, developers can run massive models such as the 1-trillion-parameter Kimi-K2-Thinking model using far less power than traditional GPU clusters. The feature also supports the M4 Pro Mac mini, the M4 Pro/Max MacBook Pro, and the 14-inch MacBook Pro (though the last of these is limited to Thunderbolt 4). The new capability promises substantial performance gains for on-device AI research without requiring extra hardware.

Tags: MLX
