Google told Meta in March that the social‑media company’s use of the Gemini large language model was approaching the limit of the compute resources the search‑engine giant had set aside for it. After Meta’s usage continued to climb, the two firms agreed to impose a hard cap, according to sources familiar with the matter.

Meta relies on Gemini for a range of internal tools, from customer‑service chatbots and advertiser‑help interfaces to code‑generation utilities that assist engineers. The model also underpins systems that detect harmful content and flag scams. Officials said Gemini outperformed Meta’s own open‑source Llama models, prompting the decision to adopt Google’s service despite Meta’s lack of a proprietary cloud platform.

Without its own cloud business, Meta has been racing to expand its data‑center footprint, pledging a $600 billion investment in cloud infrastructure over the next two years. Even so, the company’s rapid build‑out has not kept pace with its appetite for AI compute, leaving it dependent on external providers such as Google.

In response to the capacity crunch, Google asked Meta to improve token efficiency. Employees were instructed to reduce unnecessary calls to the model, a move that aims to stretch the available resources while preserving performance for critical workloads.

Google is feeling the pressure too. The search‑engine titan recently struck a $920 million‑per‑month agreement with SpaceX to tap the aerospace firm’s data centers for its Gemini Enterprise offering. The deal underscores how even the biggest AI providers are scrambling for extra horsepower.

Industry analysts note that the surge in token prices has forced many AI users to scale back usage. While high‑volume customers reap the benefits of faster, more capable models, the revenue AI providers have generated so far covers only a fraction of their operating costs. OpenAI, for example, has yet to turn a profit despite its rapid growth.

Meta also runs models from Anthropic, such as Claude, for similar tasks, diversifying its AI stack amid the capacity squeeze. The company’s broader strategy reflects a shift toward a multi‑vendor approach as the market grapples with limited compute supply.

The episode illustrates a broader trend: as AI applications become integral to business operations, the demand for raw processing power is outstripping supply. Providers are compelled to manage allocations tightly, and customers like Meta must adapt to stricter usage controls while they continue to invest heavily in their own data‑center capabilities.

This article was written with the assistance of AI.